A Framework for Learning Knowledge-Powered Word Embedding
Authors
Abstract
Neural network techniques are widely applied to obtain high-quality distributed representations of words, i.e., word embeddings, for text mining and natural language processing tasks. Recently, efficient methods have been proposed to learn word embeddings from context, capturing both semantic and syntactic relationships between words. However, it remains challenging to handle unseen words or rare words with insufficient context. In this paper, we propose a framework that can leverage general pairwise word similarity to address these challenges. As an example, we take advantage of the seemingly less obvious but essentially important morphological word similarity to show the power of our framework. In particular, we introduce a novel neural network architecture that leverages both contextual information and morphological word similarity to learn word embeddings. Experiments on an analogical reasoning task demonstrate that the proposed method can greatly enhance the effectiveness of word embeddings.
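As a rough illustration of the idea of using pairwise word similarity to represent a rare or unseen word, the sketch below builds a vector for such a word from the embeddings of morphologically similar known words. It is not the neural architecture proposed in the paper: the abstract does not specify the similarity measure or how the two sources are combined, so the string-overlap ratio from `difflib`, the helper names `morph_similarity`, `knowledge_embedding`, and `combine`, and the parameters `top_k`, `min_sim`, and `alpha` are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def morph_similarity(a, b):
    """String-overlap ratio in [0, 1]; an assumed stand-in for morphological similarity."""
    return SequenceMatcher(None, a, b).ratio()


def knowledge_embedding(word, embeddings, top_k=3, min_sim=0.5):
    """Similarity-weighted average of the embeddings of the most similar known words.

    Returns None when no known word reaches the min_sim threshold.
    """
    scored = [(morph_similarity(word, w), w) for w in embeddings if w != word]
    scored = [(s, w) for s, w in scored if s >= min_sim]
    scored.sort(reverse=True)
    neighbors = scored[:top_k]
    if not neighbors:
        return None
    total = sum(s for s, _ in neighbors)
    dim = len(next(iter(embeddings.values())))
    vec = [0.0] * dim
    for s, w in neighbors:
        for i, x in enumerate(embeddings[w]):
            vec[i] += (s / total) * x
    return vec


def combine(context_vec, knowledge_vec, alpha=0.5):
    """Linear mix of a context-based vector and a similarity-based vector (assumed rule)."""
    if knowledge_vec is None:
        return list(context_vec)
    if context_vec is None:
        return list(knowledge_vec)
    return [alpha * c + (1 - alpha) * k for c, k in zip(context_vec, knowledge_vec)]


# Toy, hand-written embeddings used only for illustration.
embeddings = {"run": [1.0, 0.0], "running": [0.8, 0.2], "cat": [0.0, 1.0]}
unseen = knowledge_embedding("runs", embeddings)
print(unseen)
```

In this toy setup the unseen word "runs" has no context-based vector, so its representation comes entirely from "run" and "running"; a word with enough context could instead mix both sources through `combine`.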
Similar resources
Knowledge-Powered Deep Learning for Word Embedding
The basis of applying deep learning to natural language processing tasks is to obtain high-quality distributed representations of words, i.e., word embeddings, from large amounts of text data. However, text itself usually contains incomplete and ambiguous information, which makes it necessary to leverage extra knowledge to understand it. Fortunately, text itself already contains well-defined ...
Solving Verbal Questions in IQ Test by Knowledge-Powered Word Embedding
Verbal comprehension questions appear very frequently in Intelligence Quotient (IQ) tests, which measure a person's verbal ability, including the understanding of words with multiple senses, synonyms and antonyms, and analogies among words. In this work, we explore whether such tests can be solved automatically by deep learning technologies for text data. We found that the task was ...
Connected Component Based Word Spotting on Persian Handwritten image documents
Word spotting makes unindexed image documents searchable by locating a word or words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...
Solving Verbal Comprehension Questions in IQ Test by Knowledge-Powered Word Embedding
An Intelligence Quotient (IQ) test is a set of standardized questions designed to evaluate human intelligence. Verbal comprehension questions appear very frequently in IQ tests, which measure a person's verbal ability, including the understanding of words with multiple senses, synonyms and antonyms, and analogies among words. In this work, we explore whether such tests can be solved automa...
Perform Three Data Mining Tasks with Crowdsourcing Process
In data mining studies, because performing feature selection by hand is complex, some of the labeling needs to be sent to workers through crowdsourcing. The process of outsourcing data mining tasks to users is often handled by software systems without enough knowledge of the users' age or place of residence. Uncertainty about the performance of virtual user...